This paper provides estimation and inference methods for the boundary of an identified set (i.e., its support function) when the selection among a very large number of covariates is based on modern regularized tools. I characterize the boundary through a semiparametric moment equation. Combining Neyman-orthogonality and sample-splitting ideas, I construct a root-N consistent, uniformly asymptotically Gaussian estimator of the boundary and propose a multiplier bootstrap procedure to conduct inference. I apply this result to the partially linear model, the partially linear IV model, and the average partial derivative with an interval-valued outcome.
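A minimal sketch of the estimation idea under generic placeholders (not the paper's code): the orthogonal moment function `psi` and the first-stage learner `fit_nuisance` below are hypothetical stand-ins for the paper's model-specific objects, and the multiplier bootstrap only illustrates how cross-fitted scores can be resampled to build a uniform band for the support function over a grid of directions.

```python
import numpy as np

def cross_fitted_scores(data, directions, psi, fit_nuisance, n_folds=5, seed=0):
    """Orthogonal scores psi(W_i, eta_hat_{-k(i)}, q) for each direction q."""
    n = len(data)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    scores = np.empty((n, len(directions)))
    for k in range(n_folds):
        train, test = folds != k, folds == k
        eta_hat = fit_nuisance(data[train])        # regularized first stage (e.g. Lasso)
        for j, q in enumerate(directions):
            scores[test, j] = psi(data[test], eta_hat, q)
    return scores

def multiplier_bootstrap_band(scores, level=0.95, n_boot=2000, seed=1):
    """Support-function estimate and a uniform-in-direction band half-width."""
    n = scores.shape[0]
    theta_hat = scores.mean(axis=0)
    centered = scores - theta_hat
    sigma = centered.std(axis=0, ddof=1)
    rng = np.random.default_rng(seed)
    sup_stats = np.empty(n_boot)
    for b in range(n_boot):
        xi = rng.standard_normal(n)                # Gaussian multipliers
        boot_mean = (xi[:, None] * centered).mean(axis=0)
        sup_stats[b] = np.max(np.abs(np.sqrt(n) * boot_mean / sigma))
    crit = np.quantile(sup_stats, level)
    return theta_hat, crit * sigma / np.sqrt(n)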
This paper provides estimation and inference methods for conditional average treatment effects (CATE) characterized by a high-dimensional parameter, in both homogeneous cross-sectional and unit-heterogeneous dynamic panel data settings. In our leading example, we model CATE by interacting the base treatment variable with explanatory variables. The first step of our procedure is orthogonalization, where we partial out the controls and unit effects from the outcome and the base treatment and take the cross-fitted residuals. This step uses a novel generic cross-fitting method we design for weakly dependent time series and panel data. This method "leaves out the neighbors" when fitting nuisance components, and we justify it theoretically using Strassen's coupling. As a result, we can rely on any modern machine learning method in the first step, provided it learns the residuals well enough. Second, we construct an orthogonal (or residual) learner of CATE -- the Lasso CATE -- that regresses the outcome residual on the vector of interactions of the residualized treatment with explanatory variables. If the CATE function is less complex than the first-stage regression function, the orthogonal learner converges faster than the single-stage regression-based learner. Third, we perform simultaneous inference on the parameters of the CATE function using debiasing. We can also use ordinary least squares in the last two steps when CATE is low-dimensional. In heterogeneous panel data settings, we model the unobserved unit heterogeneity as a weakly sparse deviation from Mundlak (1978)'s model, which treats correlated unit effects as a linear function of time-invariant covariates, and we use L1-penalization to estimate these models. We demonstrate our methods by estimating price elasticities of groceries based on scanner data. We note that our results are new even for the cross-sectional (i.i.d.) case.
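The residual-on-residual step can be illustrated with a minimal i.i.d. sketch, assuming generic outcome Y, base treatment D, controls X, and explanatory variables Z; it uses standard scikit-learn cross-fitting rather than the paper's leave-out-the-neighbors scheme and omits the final debiasing step.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_predict

def lasso_cate(Y, D, X, Z, alpha=0.01, n_folds=5):
    """Estimate CATE(z) ~ z'beta by regressing outcome residuals on
    (treatment residual x explanatory variable) interactions."""
    # Step 1: cross-fitted residuals -- partial the controls X out of Y and D.
    Y_res = Y - cross_val_predict(RandomForestRegressor(), X, Y, cv=n_folds)
    D_res = D - cross_val_predict(RandomForestRegressor(), X, D, cv=n_folds)
    # Step 2: Lasso on the residualized interactions (the "Lasso CATE" step).
    interactions = D_res[:, None] * Z          # columns: D_res * Z_1, ..., D_res * Z_p
    model = Lasso(alpha=alpha, fit_intercept=False).fit(interactions, Y_res)
    beta = model.coef_
    return beta, Z @ beta                      # coefficients and fitted CATE(Z_i)
```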
Automated analysis of social network data is one of the classic challenges of natural language processing. During the COVID-19 pandemic, mining people's stances from public messages became crucial for understanding attitudes toward health mandates. In this paper, the authors propose transformer-based predictive models for classifying premises in Twitter texts. This work was completed as part of the Social Media Mining for Health (SMM4H) 2022 Workshop. We explore modern transformer-based classifiers in order to build a pipeline that effectively captures tweet semantics. Our experiments on a Twitter dataset show that RoBERTa outperforms other transformer models on the premise prediction task. The model achieves competitive performance, with a ROC AUC of 0.807 and an F1 score of 0.7648.
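A minimal sketch of such a pipeline, assuming a Hugging Face checkpoint name as a placeholder (the authors' fine-tuned model and training details are not shown here):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from sklearn.metrics import roc_auc_score, f1_score

MODEL_NAME = "roberta-base"  # assumption: fine-tuned on the SMM4H premise data in practice
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def predict_premise(tweets):
    """Return P(premise) for a list of tweet strings."""
    inputs = tokenizer(tweets, padding=True, truncation=True, max_length=128,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[:, 1].numpy()

def evaluate(tweets, labels, threshold=0.5):
    """ROC AUC and F1 for binary premise labels."""
    probs = predict_premise(tweets)
    return {"roc_auc": roc_auc_score(labels, probs),
            "f1": f1_score(labels, probs >= threshold)}
```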
The travel time estimation problem is widely recognized as a fundamental challenge of modern logistics. The complex interconnection between the spatial structure of roads and the temporal dynamics of ground transportation still leaves room for improvement. At the same time, the volume of data accumulated to date encourages the construction of learning models that promise to substantially outperform earlier solutions. To address the travel time estimation problem, we propose a new transformer-based approach, TransTTE.
Supporting the current trends in the AI community, we present the AI Journey 2021 Challenge called Fusion Brain, which aims to make a single, general architecture process different modalities (namely, images, texts, and code) and solve multiple tasks for vision and language. The Fusion Brain Challenge (https://github.com/sberbank-ai/fusion_brain_aij2021) combines the following specific tasks: Code2code Translation, Handwritten Text Recognition, Zero-shot Object Detection, and Visual Question Answering. We created datasets for each task to test the participants' submissions. In addition, we have opened a new handwritten dataset in both Russian and English, which consists of 94,130 pairs of images and texts. The Russian part of the dataset is the largest Russian handwritten dataset in the world. We also propose baseline solutions, along with the corresponding task-specific and overall metrics.
Deep neural networks unlock a wide range of new applications by solving tasks previously regarded as requiring higher human intelligence. One of the developments enabling this success is the boost in computing power provided by dedicated hardware such as graphics or tensor processing units. However, these do not exploit fundamental features of neural networks, such as parallelism and analog state variables. Instead, they emulate neural networks relying on binary computation, which leads to unsustainable energy consumption and comparatively low speed. Fully parallel and analog hardware promises to overcome these challenges, yet the impact of analog neuron noise and its propagation, i.e. accumulation, threatens to render such approaches inept. Here, we determine for the first time the propagation of noise in deep neural networks comprising noisy nonlinear neurons in trained, fully connected layers. We study additive and multiplicative as well as correlated and uncorrelated noise, and develop analytical methods to predict the noise level in any layer of symmetric deep neural networks or of trained deep neural networks. We find that noise accumulation is generally bounded and that adding additional network layers does not worsen the signal-to-noise ratio beyond a limit. Most importantly, noise accumulation can be suppressed entirely when the neuron activation functions have a slope smaller than unity. We therefore develop a framework for noise in fully connected deep neural networks implemented in analog systems and identify criteria that allow engineers to design noise-resilient novel neural network hardware.
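A minimal numerical sketch of the setting, with illustrative (assumed) weights, noise amplitudes, and activation, rather than the paper's analytical method: it propagates a signal through noisy nonlinear layers alongside a noiseless copy and reports the per-layer signal-to-noise ratio.

```python
import numpy as np

def propagate(x0, weights, sigma_add=0.01, sigma_mul=0.01, slope=0.5, seed=0):
    """Run a clean and a noisy forward pass in parallel and return the SNR
    (clean signal power / accumulated noise power) after each layer."""
    rng = np.random.default_rng(seed)
    clean, noisy, snrs = x0.copy(), x0.copy(), []
    for W in weights:
        clean = np.tanh(slope * (W @ clean))        # noiseless reference path
        out = np.tanh(slope * (W @ noisy))          # noisy path, same weights
        out = out * (1 + sigma_mul * rng.standard_normal(out.shape)) \
              + sigma_add * rng.standard_normal(out.shape)   # multiplicative + additive noise
        noisy = out
        snrs.append(np.mean(clean**2) / np.mean((noisy - clean)**2))
    return snrs

# Example: 10 fully connected layers of 100 neurons with random (untrained) weights.
rng = np.random.default_rng(1)
layers = [rng.standard_normal((100, 100)) / np.sqrt(100) for _ in range(10)]
print([round(s, 1) for s in propagate(rng.standard_normal(100), layers)])
```

With the sub-unity activation slope assumed above, the simulated SNR stays bounded as layers are added, which is the qualitative behavior the abstract describes.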